Perfect hashing using sparse matrix packing

Authors

  • Marshall D. Brain
  • Alan L. Tharp
Abstract

This article presents a simple algorithm for packing sparse 2-D arrays into minimal 1-D arrays in O(r^2) time. Retrieving an element from the packed 1-D array is O(1). This packing algorithm is then applied to create minimal perfect hashing functions for large word lists. Many existing perfect hashing algorithms process large word lists by segmenting them into several smaller lists. The perfect hashing function described in this article has been used to create minimal perfect hashing functions for unsegmented word sets of up to 5000 words. Compared with other current algorithms for perfect hashing, this algorithm is a significant improvement in terms of both time and space efficiency.
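To make the packing idea in the abstract concrete, the following is a minimal sketch of first-fit row displacement, a common way to overlay the rows of a sparse 2-D array onto a single 1-D array. It is an illustration under stated assumptions, not necessarily the authors' exact procedure: the names `pack_rows` and `lookup`, the use of `None` for empty cells, and the densest-row-first ordering are all choices made here. First-fit does not guarantee a minimal result in general, and this sketch makes no claim about the running time quoted in the abstract.

```python
def pack_rows(matrix):
    """Pack a sparse 2-D list into a 1-D list by first-fit row displacement.

    Returns (offsets, packed). A non-empty cell (r, c) is stored as the
    tuple (r, value) at packed[offsets[r] + c]. Empty cells are None.
    """
    rows = [
        [(c, v) for c, v in enumerate(row) if v is not None]
        for row in matrix
    ]
    # Assumption: place denser rows first, since they are hardest to fit.
    order = sorted(range(len(rows)), key=lambda r: -len(rows[r]))
    offsets = [0] * len(rows)
    packed = []  # grows on demand; None marks a free slot

    for r in order:
        if not rows[r]:
            continue
        # Smallest offset that keeps every slot index non-negative.
        offset = -rows[r][0][0]
        # Slide the row right until none of its cells hits a used slot.
        while any(
            offset + c < len(packed) and packed[offset + c] is not None
            for c, _ in rows[r]
        ):
            offset += 1
        for c, v in rows[r]:
            slot = offset + c
            if slot >= len(packed):
                packed.extend([None] * (slot + 1 - len(packed)))
            packed[slot] = (r, v)  # the row tag lets lookup reject foreign cells
        offsets[r] = offset
    return offsets, packed


def lookup(offsets, packed, r, c):
    """Constant-time retrieval: one addition, one index, one tag check."""
    slot = offsets[r] + c
    if 0 <= slot < len(packed):
        entry = packed[slot]
        if entry is not None and entry[0] == r:
            return entry[1]
    return None


matrix = [
    ["a", None, "b"],
    ["c", None, None],
    [None, None, "d"],
]
offsets, packed = pack_rows(matrix)
```

For this small input the four non-empty cells end up in a packed array of exactly four slots. Storing the row index next to each value is one simple way to tell a genuine hit from a slot owned by another row; a hashing application would typically store the key itself for the same purpose.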

Similar resources

Using Tries to Eliminate Pattern Collisions in Perfect Hashing

Many current perfect hashing algorithms suffer from the problem of pattern collisions. In this paper, a perfect hashing technique that uses array-based tries and a simple sparse matrix packing algorithm is introduced. This technique eliminates all pattern collisions, and because of this it can be used to form ordered minimal perfect hash functions on extremely large word lists. This algorithm i...

A Letter-oriented Perfect Hashing Scheme Based upon Sparse Table Compression

In this paper, a new letter-oriented perfect hashing scheme based on Ziegler’s row displacement method is presented. A unique n-tuple from a given set of static letter-oriented key words can be extracted by a heuristic algorithm. Then the extracted distinct n-tuples are associated with a 0/1 sparse matrix. Using a sparse matrix compression technique, a perfect hashing function on the key word...

Indexing Internal Memory with Minimal Perfect Hash Functions

A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are wide...

Sparse signal recovery using sparse random projections

Feature Hashing for Language and Dialect Identification

We evaluate feature hashing for language identification (LID), a method not previously used for this task. Using a standard dataset, we first show that while feature performance is high, LID data is highly dimensional and mostly sparse (>99.5%) as it includes large vocabularies for many languages; memory requirements grow as languages are added. Next we apply hashing using various hash sizes, d...

Journal:
  • Inf. Syst.

Volume 15  Issue 

Pages  -

Publication date: 1990